knitr::include_graphics("lightroom_ji_round.jpg")
There are several advantages to dedicating a data visualization portfolio and a 10-week effort to digital currency:
Massive and robust data source: The block chain is a public transaction log that exists as a distributed database, validated by the network's consensus algorithm.
Fitting data types: Block chain data includes timestamps and IP addresses and can be updated frequently, which makes it useful for both static and dynamic data visualizations.
Well-supported: I can save the entire block chain file locally, use community-maintained tools to parse it, and keep the data up to date.
Policy-setting: Digital currencies are being used as a store of value to fight inflation in turbulent economies such as Zimbabwe and Latin American countries. Central banks are issuing experimental policies in response to the spread of digital currency and block chain technology.
Here I explore some data sources published by major Bitcoin exchanges and platforms as a market overview.
rectangle <- data.frame(xmin = as.POSIXct(c("2017-03-25")),
xmax = as.POSIXct(c("2017-10-21")),
ymin = 0,
ymax = Inf)
line <- ggplot(data=btc_price) +
geom_line(size=.25,
aes(Date, Value),
color="#325a8c") +
geom_vline(xintercept = as.POSIXct("2017-03-25"),
colour="#ff7575",
size=.5,
linetype="dotted",
alpha=.75) +
scale_y_continuous(name ="Bitcoin Market Price in USD",
breaks = (seq(0,6000,1000))
) +
scale_x_datetime(date_breaks = "1 year",
date_labels = "%Y",  # format each yearly break as a four-digit year instead of relying on a hard-coded label vector
expand=c(0,0)) +
ggtitle("Bitcoin Price",
subtitle = "experienced the greatest surge in history this year") +
labs(caption = "Source: Blockchain.com") +
annotate(geom="text",
x=as.POSIXct("2016-01-25"),
y=2150,
label="Surge started on Mar 25, 2017",
colour="#ff7575",
family="Avenir",
fontface="bold",
alpha=1) +
theme_jiye
# add surge bar
line + geom_rect(data = rectangle, aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax),
fill = "#ff7575", alpha = 0.1)
The price of Bitcoin can rise or fall drastically over a short period. This graph mainly serves as preliminary evidence for pinpointing the dates when a new policy or event may have affected the price of Bitcoin.
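As a rough sketch of how candidate dates could be shortlisted before looking for matching policies or events, the day-over-day percentage change can be ranked by size. The `toy_price` data frame below is made up; it only assumes the same `Date` and `Value` columns that `btc_price` uses in the plot.

```r
# Toy stand-in for btc_price (made-up prices); the real data frame is
# assumed to have a POSIXct `Date` column and a numeric `Value` column.
toy_price <- data.frame(
  Date  = as.POSIXct(c("2017-03-23", "2017-03-24", "2017-03-25",
                       "2017-03-26", "2017-03-27"), tz = "UTC"),
  Value = c(1000, 950, 900, 990, 1040)
)

# Day-over-day percentage change; the first row has no predecessor.
toy_price$pct_change <- c(NA, diff(toy_price$Value) / head(toy_price$Value, -1) * 100)

# Rank days by the absolute size of the move (NA sorts last).
ranked <- toy_price[order(-abs(toy_price$pct_change)), ]
head(ranked, 3)
```

The top rows are only starting points; whether a given move lines up with a policy or event still has to be checked by hand.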
# graph 2-1: nodes map (updated week 5)
nodes_map <- ggplot(data = joined, aes(x = long, y = lat, group = group, fill = cut(node_count, c(0, 100, 500, 1000, 2000, Inf)))) +
geom_polygon() +
coord_map("gall", 45, ylim = c(90, -60)) +
theme_clean_map() +
scale_fill_manual(values=c("#dddddd", "#bbbbbb", "#999999", "#777777", "#555555"),
name = "Number of Nodes",
labels=c("less than 1% of the total nodes",
"100 to 500",
"500 to 1000",
"1000 to 2000",
"2000 and more"))
nodes_map <- nodes_map +
ggtitle("Global Nodes Distribution is Uneven Across Continents",
subtitle = "Africa and the Middle East are largely absent") +
labs(caption = "Source: BitNodes API") +
theme_map + theme(legend.position = c(0.15, 0.15),
legend.title = element_text(family = "Avenir"),
legend.text = element_text(family = "Avenir"),
legend.direction = "vertical")
nodes_map
#ggsave("gnd.pdf", plot = nodes_map, device="pdf")
knitr::include_graphics("global_nodes_dist.png")
I like this NA == white specification much more than my previous design. It makes the absence of Africa and the Middle East obvious: they have simply disappeared from sight in the enclosed world of the Bitcoin economy, and this visualization captures that.
Bitcoin uses “nodes” to validate transactions and secure the network. A node is a machine instance running the Bitcoin Core client with a complete copy of the block chain. The more functioning nodes there are, the more secure the network is. This map shows the geographic distribution of nodes. Note that the distribution is not only uneven regionally; there is also a discrepancy between where the nodes are concentrated (North America and Europe) and where Bitcoin could perhaps help most, among people facing economic turbulence or lacking access to financial resources (Africa).
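To see which countries end up without a fill bin, here is a minimal sketch of how `cut()` treats the break points used in the map. The node counts are hypothetical. Because the intervals are right-closed and start at 0, a count of exactly 0 falls outside every bin and becomes `NA`, just like a country with no data at all.

```r
# Hypothetical node counts; NA stands for a country missing from the node data.
node_count <- c(0, 12, 100, 350, 1500, 2962, NA)

# Same break points as the map. Intervals are right-closed by default,
# so 100 lands in the first bin and 0 lands in no bin at all.
bins <- cut(node_count, c(0, 100, 500, 1000, 2000, Inf))

# Count how many values fall in each bin, including the NA group.
table(bins, useNA = "ifany")
```

If the white fill for those countries is wanted in the ggplot itself, `na.value = "white"` in `scale_fill_manual()` is one way to get it. That argument is an assumption here, since the code for the newer design is not shown.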
# graph 2-2: percentage bar
# locks factor level
top_country_nodes$country <- factor(top_country_nodes$country, levels = top_country_nodes$country)
nation_cols2 <- c(
"#93B7BE","#584F50","#b3906a","#FF991C",'#fb4d3d',
"#999999","#999999","#999999","#999999","#999999",
"#999999","#999999","#999999","#999999","#999999",
"#dddddd"
)
perc_bar <- ggplot(data = top_country_nodes, aes(x="", y = node, fill=factor(country))) +
geom_bar(stat='identity', position = position_stack(reverse = TRUE), width=.75, colour="white") +
coord_flip() +
scale_fill_manual(values=nation_cols2) +
theme_percBar
perc_bar + scale_y_continuous(name="",
labels=c("0%", "USA\n29.62%", "Germany\n17.16%", "France\n6.65%", "Netherlands\n5.19%", "China\n5.11%", "< Countries over 1%", "Rest of the World\n13.97%"),
breaks = (c(0, 2962, (2962+1716), (2962+1716+665), (2962+1716+665+519), (2962+1716+665+519+511),9973-1397,
sum(top_country_nodes$node))),
expand=c(0,0)
) +
labs(caption = "Source: BitNodes") +
ggtitle("Block Chain Network is Distributed",
subtitle = " in high GDP countries, with the Middle East underrepresented")
The node-by-country ranking shown here aligns closely with the country/region ranking of the Global Financial Centres Index, which measures financial competitiveness based on “business environment”, “financial sector development”, “infrastructure factors”, “human capital”, and “reputation and general factors”. These aspects are helpful in thinking about why some high-GDP countries are not hosting Bitcoin nodes. For example, Middle Eastern countries may lack the human capital needed, while India may fall short in financial sector development and infrastructure factors. If the correlations are not coincidental, they can be used as a reference for building a “cryptocurrency friendliness” index.
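One way to put a number on how well two rankings align is a Spearman rank correlation. The sketch below uses placeholder countries and invented ranks, not the actual node counts or index positions, so only the mechanics carry over.

```r
# Placeholder rankings (invented): position of six countries by node count
# and by a financial-centre index. Real ranks would replace these vectors.
rankings <- data.frame(
  country    = c("A", "B", "C", "D", "E", "F"),
  node_rank  = c(1, 2, 3, 4, 5, 6),
  index_rank = c(2, 1, 3, 5, 4, 6),
  stringsAsFactors = FALSE
)

# Spearman's rho: 1 means identical ordering, -1 means reversed ordering.
rho <- cor(rankings$node_rank, rankings$index_rank, method = "spearman")
rho
```

With only a handful of countries, a high rho is suggestive at best; it would not by itself show that the alignment is more than coincidence.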
# graph 2-3: miner pool
#pool_perc_bar <- ggplot(data = miner_company, aes(x="", y = blocks_mined, fill=factor(miner_pool))) +
# geom_bar(stat='identity', position = position_stack(reverse = TRUE), width=.75, colour="white") +
# coord_flip() +
# scale_fill_discrete() +
# theme_percBar + theme(legend.position = "bottom")
# pool_perc_bar
perc_china <- miner_country[miner_country$country_of_origin == "China", "blocks_mined_perc"][["blocks_mined_perc"]]
perc_Czech <- perc_china + miner_country[miner_country$country_of_origin == "Czech Republic", "blocks_mined_perc"][["blocks_mined_perc"]]
perc_Georgia <- perc_Czech + miner_country[miner_country$country_of_origin == "Georgia", "blocks_mined_perc"][["blocks_mined_perc"]]
perc_india <- perc_Georgia + miner_country[miner_country$country_of_origin == "India", "blocks_mined_perc"][["blocks_mined_perc"]]
perc_Norway <- perc_india + miner_country[miner_country$country_of_origin == "Norway", "blocks_mined_perc"][["blocks_mined_perc"]]
perc_Rus <- perc_Norway + miner_country[miner_country$country_of_origin == "Russia", "blocks_mined_perc"][["blocks_mined_perc"]]
country_perc_bar <- ggplot(data = miner_country, aes(x="", y = blocks_mined_perc, fill=factor(country_of_origin))) +
geom_bar(stat='identity', position = position_stack(reverse = TRUE), width=.75, colour="white") +
coord_flip() +
scale_fill_manual(values=c("#fb4d3d", "#74B9D5", "#F27A30", "#FCCA46", "#9ECF9A", "#002865", "#999999"))
country_perc_bar + scale_y_continuous(name="",
labels=c("", "China 76.1%", "Czech Rep 7.3%", "Georgia 4.1%", "India 2.4%", "Norway 3.7%", "", "Global or Unknown origin 6.1%"),
breaks = c(0, perc_china, perc_Czech, perc_Georgia, perc_india, perc_Norway, perc_Rus, 1),
expand = c(0,0)) +
theme_percBar + theme(legend.position = "none",
axis.text.x = element_text(angle = 30)
) +
ggtitle("If All the Bitcoin Miners Were a Person...",
subtitle = "He is basically Chinese, with some Second World lineage, one Indian forearm, and a full-blood transfusion from anonymous donors.") +
labs(caption = "Source: Blockchain.info")  # the argument is `caption`; `captions` does not set the plot caption
Here I have much simpler data to visualize, and I find the percentage bar a good fit for presenting it. For the title, I experimented with the “weird comparison” discussed by Amanda Cox in her talk, in the hope of leaving the viewer with something more memorable from this simple graph.
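The axis breaks for a percentage bar are the running totals of the segment sizes, so they can also be derived with `cumsum()` rather than one variable per country. A minimal sketch with made-up shares follows; it assumes the rows are already in stacking order and that the shares are fractions summing to 1, as `blocks_mined_perc` appears to be.

```r
# Made-up shares standing in for miner_country$blocks_mined_perc.
toy_miners <- data.frame(
  country_of_origin = c("A", "B", "C", "Other"),
  blocks_mined_perc = c(0.50, 0.25, 0.15, 0.10),
  stringsAsFactors = FALSE
)

# Running totals give the right edge of each stacked segment.
right_edge <- cumsum(toy_miners$blocks_mined_perc)

# Prepend 0 so the axis also marks the left edge of the first segment.
axis_breaks <- c(0, right_edge)
axis_breaks
```

The resulting vector can be passed to `breaks` in `scale_y_continuous()`, with one label per break.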
# note: color #85bb65 is nicknamed as "dollar bill green".
# color #FF9900 is the orange from bitcoin logo.
# ISSUE:Both colors are ugly af. Switch soon.
# ISSUE:graph doesn't show how USD volume precedes BTC clear enough.
area <- ggplot(data=agg, aes(x = as.POSIXct(date), y = volume, fill = as.factor(unit))) +
geom_area(position = "stack") +
scale_x_datetime(date_breaks = "3 months", expand=c(0,0)) +
scale_y_continuous(breaks = c(0, 250, 500, 750, 1000, 2000, 3000)) +
scale_fill_manual(values = alpha(c("#FF9900", "#85bb65"), 0.2)) +  # the 0.2 opacity belongs inside alpha()
ggtitle("Bitcoin Daily Trading Volume",
subtitle = "in number of bitcoin & Thousands of USD") +
labs(caption = "Source: Coinbase API",
x = "",
y = "Volume") +
theme_volume
area
In this graph, we can see that Bitcoin’s daily trading volume is extremely volatile, perhaps even more so than its price. The graph is informative in three respects:
This level of volatility can be a sign of market manipulation, especially given Bitcoin’s young economy, novel nature, and lack of regulation.
It can help identify the points in time where major policy changes either encouraged (spike) or discouraged (trough) speculation.
The most interesting trend is that whenever there is a big spike, the USD volume spikes before the Bitcoin volume does, which suggests that more buyers enter the market as the price rises, potentially driven by optimism.
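The lead-lag observation in the last point could be checked with a cross-correlation. The sketch below uses synthetic series in which the BTC series is simply the USD series delayed by one day; it assumes the real `agg` data would first be split into two aligned daily vectors, one per unit.

```r
set.seed(1)

# Synthetic daily volumes: btc is usd shifted one day later, so usd leads by 1.
usd <- rnorm(200)
btc <- c(0, head(usd, -1))

# ccf(x, y) at lag k estimates the correlation between x[t + k] and y[t].
cc <- ccf(usd, btc, lag.max = 5, plot = FALSE)

# Lag with the strongest correlation; a negative value means usd moves first.
best_lag <- cc$lag[which.max(cc$acf)]
best_lag
```

On real data the peak would be far less clean, and a negative lag would be consistent with USD volume leading without establishing why.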
# ISSUE: Probably shouldn't use Heatmap.
# ISSUE: Needs more deliberate, consistent color scheme
heatmap <- ggplot(soc, aes(x=rater, y=ratee, fill=score)) + geom_tile() +
scale_fill_distiller(type="div", palette = "RdYlBu") +
ggtitle("The trust network of Bitcoin traders",
subtitle = "is sparse and unregulated") +
labs(caption = "Source: Bitcoin OTC trust network by Stanford University") +
theme_heatmap
heatmap
While sparse and barely visible, this graph provides some insights into the Bitcoin market. First, it allows us to identify some super raters and ratees, who are more likely to be the major players capable of manipulating the Bitcoin price. Second, although orange (trustworthy) is more prevalent, there are quite a few blue spots on this heatmap, marking users that others have rated as fraudulent. At a glance, fraud is frequent in the digital currency market, which underscores its unregulated nature.
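Since the heatmap is hard to read, the same two findings can be approximated with simple counts. This is a sketch on a made-up edge list that reuses the `rater`, `ratee`, and `score` column names from `soc`; treating negative scores as distrust is an assumption about the scoring scale.

```r
# Toy edge list with the same column names as `soc`; all values are made up.
toy_soc <- data.frame(
  rater = c("u1", "u1", "u1", "u2", "u3", "u3"),
  ratee = c("u2", "u3", "u4", "u1", "u1", "u4"),
  score = c(5, 3, -10, 2, 1, -8),
  stringsAsFactors = FALSE
)

# Ratings given per rater, most active first ("super raters" float to the top).
given <- sort(table(toy_soc$rater), decreasing = TRUE)

# Ratees that received at least one negative score (assumed to mean distrust).
flagged <- unique(toy_soc$ratee[toy_soc$score < 0])

given
flagged
```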
# ISSUE: cannot knit this code chunk
net <- graph_from_data_frame(d=links, directed = TRUE)
# set color
col.1 <- adjustcolor("#666666", alpha=0.4)
col.2 <- adjustcolor("#000000", alpha=0.4)
#edge.pal <- colorRampPalette(c(col.1, col.2))#, alpha = TRUE)
#edge.col <- edge.pal(100)
edge.col <- col.1
colrs_transparent <- adjustcolor("#FFFFFF", alpha=0)
net_simplified <- simplify(net, remove.multiple = FALSE, edge.attr.comb = list(Weight = "sum"), remove.loops = TRUE)
l <- layout_on_sphere(net_simplified)
plot(net_simplified,
layout=l,
edge.arrow.size=0.1,
vertex.size=2,
vertex.color="#FF9900",
#vertex.frame.color=colrs_transparent,
vertex.label=NA,
edge.color="grey50",
main="The majority of long-term Bitcoin users do not engage in transactions",
sub="Data source: ELTE Bitcoin Project website and resources (2009 - 2013)"
)
legend(x=-0.3, y=-1.1, c("long-term bitcoin user"), col="#FF9900", pch=21, pt.bg="#FF9900")
This graph visualizes the transaction activity of several thousand long-term Bitcoin users during 2009 - 2013. We can see that the majority are inactive, with one user radiating assets to many others and a few others repeatedly sending money to themselves. Not only does this network graph serve as evidence that most people purchase Bitcoin for speculation, it also shows that, while the technological philosophy of Bitcoin is decentralization, its ownership and commercial activity are very much the opposite.
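The two patterns described above, self-transfers and a single user paying many others, can also be counted directly from the edge list without drawing the network. The sketch uses a made-up edge list in base R; the `from` and `to` column names are assumptions about how `links` is laid out.

```r
# Toy edge list standing in for `links`; `from` and `to` are assumed names.
toy_links <- data.frame(
  from = c("a", "a", "a", "b", "c", "c"),
  to   = c("b", "c", "d", "b", "c", "c"),
  stringsAsFactors = FALSE
)

# Transfers where sender and receiver are the same user.
self_transfers <- toy_links[toy_links$from == toy_links$to, ]

# Distinct counterparties paid by each sender, ignoring self-transfers.
real <- unique(toy_links[toy_links$from != toy_links$to, ])
out_degree <- sort(table(real$from), decreasing = TRUE)

nrow(self_transfers)
out_degree
```

A sender with a very high count here corresponds to the hub that radiates assets in the plot.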
knitr::include_graphics("signature_YeJi.png")